NeurJSCC Enabled Semantic Communications: Paradigms, Applications, and Potentials
Recent advances in deep learning have led to increased interest in solving
high-efficiency end-to-end transmission problems using methods that employ the
nonlinear property of neural networks. These techniques, which we call neural joint
source-channel coding (NeurJSCC), extract latent semantic features of the
source signal across space and time, and design corresponding variable-length
NeurJSCC approaches to transmit latent features over wireless communication
channels. Rapid progress has led to numerous research papers, but a
consolidation of the discovered knowledge has not yet emerged. In this article,
we gather diverse ideas to categorize the expansive aspects of NeurJSCC into two
paradigms, i.e., explicit and implicit NeurJSCC. We first focus on those two
paradigms of NeurJSCC by identifying their common and different components in
building end-to-end communication systems. We then focus on typical
applications of NeurJSCC to various communication tasks. Our article highlights
the improved quality, flexibility, and capability brought by NeurJSCC, and we
also point out future directions.
Variational Speech Waveform Compression to Catalyze Semantic Communications
We propose a novel neural waveform compression method to catalyze emerging
speech semantic communications. By introducing nonlinear transform and
variational modeling, we effectively capture the dependencies within speech
frames and estimate the probabilistic distribution of the speech feature more
accurately, giving rise to better compression performance. In particular, the
speech signals are analyzed and synthesized by a pair of nonlinear transforms,
yielding latent features. An entropy model with hyperprior is built to capture
the probabilistic distribution of latent features, followed by quantization
and entropy coding. The proposed waveform codec can be optimized flexibly
toward an arbitrary rate, and another appealing feature is that it can be
easily optimized for any differentiable loss function, including perceptual
loss used in semantic communications. To further improve the fidelity, we
incorporate residual coding to mitigate the degradation arising from
quantization distortion in the latent space. Results indicate that, at the
same performance, the proposed method saves up to 27% of the coding rate
compared with the widely used adaptive multi-rate wideband (AMR-WB) codec as
well as emerging neural waveform coding methods.
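As a rough illustration of the rate term that a codec of this kind optimizes, the sketch below estimates the bits needed to entropy-code rounded latents under a per-element Gaussian model, then combines rate and distortion into one loss. The Gaussian form, the function names (`gaussian_cdf`, `estimate_bits`, `rd_loss`), and the trade-off weight `lam` are assumptions made for illustration; the abstract does not specify them.

```python
import math


def gaussian_cdf(x, mean, scale):
    """CDF of a Gaussian with the given mean and standard deviation."""
    return 0.5 * (1.0 + math.erf((x - mean) / (scale * math.sqrt(2.0))))


def estimate_bits(latents, means, scales):
    """Estimated bits to code rounded latents under a Gaussian entropy model.

    In a hyperprior design, `means` and `scales` would be predicted from side
    information; here they are simply passed in (an assumption of this sketch).
    """
    total_bits = 0.0
    for y, mu, sigma in zip(latents, means, scales):
        y_hat = math.floor(y + 0.5)  # scalar quantization to nearest integer
        # Probability mass of the quantization bin [y_hat - 0.5, y_hat + 0.5].
        p = gaussian_cdf(y_hat + 0.5, mu, sigma) - gaussian_cdf(y_hat - 0.5, mu, sigma)
        total_bits += -math.log2(max(p, 1e-12))  # clamp to avoid log(0)
    return total_bits


def rd_loss(bits, distortion, lam):
    """Rate-distortion objective: rate plus weighted distortion."""
    return bits + lam * distortion
```

A sharper (smaller-scale) model centred on the latent value assigns more probability to the chosen bin and so yields fewer estimated bits, which is why a more accurate entropy model translates into a lower coding rate.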
Communication Beyond Transmitting Bits: Semantics-Guided Source and Channel Coding
Classical communication paradigms focus on accurately transmitting bits over
a noisy channel, and Shannon theory provides a fundamental theoretical limit on
the rate of reliable communications. In this approach, bits are treated
equally, and the communication system is oblivious to what meaning these bits
convey or how they would be used. Future communications towards intelligence
and conciseness will predictably play a dominant role, and the proliferation of
connected intelligent agents requires a radical rethinking of the coded
transmission paradigm to support the new communication morphology on the
horizon. The recent concept of "semantic communications" offers a promising
research direction. Injecting semantic guidance into the coded transmission
design to achieve semantics-aware communications shows great potential for
further breakthrough in effectiveness and reliability. This article sheds light
on semantics-guided source and channel coding as a transmission paradigm of
semantic communications, which exploits both data semantics diversity and
wireless channel diversity together to boost the whole system performance. We
present the general system architecture and key techniques, and indicate some
open issues on this topic.
Comment: IEEE Wireless Communications, text overlap with arXiv:2112.0309
Improved Nonlinear Transform Source-Channel Coding to Catalyze Semantic Communications
Recent deep learning methods have led to increased interest in solving
high-efficiency end-to-end transmission problems. These methods, which we call
nonlinear transform source-channel coding (NTSCC), extract the semantic latent
features of the source signal and learn an entropy model to guide the joint
source-channel coding with variable rate to transmit latent features over
wireless channels. In this paper, we propose a comprehensive framework for
improving NTSCC, thereby achieving higher system coding gain, better model
versatility, and a more flexible adaptation strategy aligned with semantic
guidance. This new sophisticated NTSCC model is now ready to support large-size
data interaction in emerging XR, which catalyzes the application of semantic
communications. Specifically, we propose three useful improvement approaches.
First, we introduce a contextual entropy model to better capture the spatial
correlations among the semantic latent features, so that more accurate rate
allocation and contextual joint source-channel coding can be developed
to enable higher coding gain. On that basis, we further propose response
network architectures to formulate versatile NTSCC, i.e., a once-trained model
supports various rates and channel states, which benefits practical
deployment. Following this, we propose an online latent feature editing method
to enable more flexible coding rate control aligned with some specific semantic
guidance. By comprehensively applying the above three improvement methods to
NTSCC, we finally obtain a deployment-friendly semantic coded transmission
system. Our improved NTSCC system has been experimentally verified to achieve
considerable bandwidth saving versus the state-of-the-art engineered VTM + 5G
LDPC coded transmission system with lower processing latency.
Wireless Deep Speech Semantic Transmission
In this paper, we propose a new class of high-efficiency semantic coded
transmission methods for end-to-end speech transmission over wireless channels.
We name the whole system deep speech semantic transmission (DSST).
Specifically, we introduce a nonlinear transform to map the speech source to
semantic latent space and feed semantic features into a source-channel encoder to
generate the channel-input sequence. Guided by the variational modeling idea,
we build an entropy model on the latent space to estimate the importance
diversity among semantic feature embeddings. Accordingly, semantic features
of different importance can be allocated different coding rates,
which maximizes the system coding gain. Furthermore, we introduce a
channel signal-to-noise ratio (SNR) adaptation mechanism such that a single
model can be applied over various channel states. The end-to-end optimization
of our model leads to a flexible rate-distortion (RD) trade-off, supporting
versatile wireless speech semantic transmission. Experimental results verify
that our DSST system clearly outperforms current engineered speech transmission
systems on both objective and subjective metrics. Compared with existing neural
speech semantic transmission methods, our model saves up to 75% of channel
bandwidth costs when achieving the same quality. An intuitive comparison of
audio demos can be found at https://ximoo123.github.io/DSST
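To make the notion of a channel state concrete, the sketch below normalizes a channel-input sequence to unit average power and passes it through an additive white Gaussian noise (AWGN) channel at a chosen SNR. The real-valued AWGN model and the helper names (`power_normalize`, `awgn`) are assumptions for illustration only; the abstract does not state which channel models or normalization DSST uses.

```python
import math
import random


def power_normalize(symbols):
    """Scale the sequence so that its average power is 1."""
    power = sum(s * s for s in symbols) / len(symbols)
    if power == 0.0:
        raise ValueError("cannot normalize an all-zero sequence")
    return [s / math.sqrt(power) for s in symbols]


def awgn(symbols, snr_db, rng):
    """Add Gaussian noise for a unit-power input at the given SNR in dB."""
    # With unit signal power, noise variance = 10^(-SNR_dB / 10).
    noise_std = math.sqrt(10.0 ** (-snr_db / 10.0))
    return [s + rng.gauss(0.0, noise_std) for s in symbols]


if __name__ == "__main__":
    rng = random.Random(0)
    tx = power_normalize([rng.uniform(-1.0, 1.0) for _ in range(8)])
    rx = awgn(tx, snr_db=10.0, rng=rng)
    print(rx)
```

An SNR-adaptive model would additionally take `snr_db` as an input to the encoder and decoder so that one set of weights can serve several channel states; that conditioning is not shown here.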
WITT: A Wireless Image Transmission Transformer for Semantic Communications
In this paper, we aim to redesign the vision Transformer (ViT) as a new
backbone to realize semantic image transmission, termed wireless image
transmission transformer (WITT). Previous works build upon convolutional neural
networks (CNNs), which are inefficient in capturing global dependencies,
resulting in degraded end-to-end transmission performance especially for
high-resolution images. To tackle this, the proposed WITT employs Swin
Transformers as a more capable backbone to extract long-range information.
Different from ViTs in image classification tasks, WITT is highly optimized for
image transmission while considering the effect of the wireless channel.
Specifically, we propose a spatial modulation module to scale the latent
representations according to channel state information, which enhances the
ability of a single model to deal with various channel conditions. As a result,
extensive experiments verify that our WITT attains better performance for
different image resolutions, distortion metrics, and channel conditions. The
code is available at https://github.com/KeYang8/WITT
Adaptive Semantic Communications: Overfitting the Source and Channel for Profit
Most semantic communication systems leverage deep learning models to provide
end-to-end transmission performance surpassing the established source and
channel coding approaches. So far, research has mainly focused on
architecture and model improvements, but a model trained over a full
dataset and ergodic channel responses is unlikely to be optimal for every test
instance. Due to limitations on the model capacity and imperfect optimization
and generalization, such learned models will be suboptimal especially when the
testing data distribution or channel response is different from that in the
training phase, as is likely to be the case in practice. To tackle this, in
this paper, we propose a novel semantic communication paradigm by leveraging
the deep learning model's overfitting property. Our model can for instance be
updated after deployment, which can further lead to substantial gains in terms
of the transmission rate-distortion (RD) performance. This new system is named
adaptive semantic communication (ASC). In our ASC system, the wirelessly
transmitted stream includes both the semantic representations of source
data and the adapted decoder model parameters. Specifically, we take the
overfitting concept to the extreme, proposing a series of ingenious methods to
adapt the semantic codec or representations to an individual data or channel
state instance. The whole ASC system design is formulated as an optimization
problem whose goal is to minimize the loss function that is a tripartite
tradeoff among the data rate, model rate, and distortion terms. The experiments
(including user study) verify the effectiveness and efficiency of our ASC
system. Notably, the substantial gain of our overfitted coding paradigm can
catalyze the upgrading of semantic communication to a new era.
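One plausible way to write the tripartite tradeoff described above is as a single weighted objective. The additive form and the weight $\lambda$ are assumptions for illustration; the abstract names only the three terms (data rate, model rate, distortion) and not how they are combined.

```latex
\mathcal{L} \;=\; R_{\mathrm{data}} \;+\; R_{\mathrm{model}} \;+\; \lambda\, D
```

Here $R_{\mathrm{data}}$ would be the cost of sending the semantic representations, $R_{\mathrm{model}}$ the cost of sending the adapted decoder parameters, and $D$ the end-to-end distortion. Under this form, a model update is worth transmitting only when its reduction in $R_{\mathrm{data}} + \lambda D$ exceeds its own cost $R_{\mathrm{model}}$.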
Wireless Deep Video Semantic Transmission
In this paper, we design a new class of high-efficiency deep joint
source-channel coding methods to achieve end-to-end video transmission over
wireless channels. The proposed methods exploit nonlinear transform and
conditional coding architecture to adaptively extract semantic features across
video frames, and transmit semantic feature domain representations over
wireless channels via deep joint source-channel coding. Our framework is
collected under the name deep video semantic transmission (DVST). In
particular, benefiting from the strong temporal prior provided by the feature
domain context, the learned nonlinear transform function becomes temporally
adaptive, resulting in a richer and more accurate entropy model guiding the
transmission of the current frame. Accordingly, a novel rate-adaptive transmission
mechanism is developed to customize deep joint source-channel coding for video
sources. It learns to allocate the limited channel bandwidth within and among
video frames to maximize the overall transmission performance. The whole DVST
design is formulated as an optimization problem whose goal is to optimize the
end-to-end transmission rate-distortion performance under perceptual quality
metrics or machine vision task performance metrics. Across standard video
source test sequences and various communication scenarios, experiments show
that our DVST can generally surpass traditional wireless video coded
transmission schemes. The proposed DVST framework can well support future
semantic communications due to its video content-aware and machine vision task
integration abilities.
Comment: published in IEEE JSA
A loss-of-function variant in SSFA2 causes male infertility with globozoospermia and failed oocyte activation
Globozoospermia (OMIM: 102530) is a rare type of teratozoospermia ( A; p.R1224Q) in the patient. This variant significantly reduced the protein expression of SSFA2. Immunofluorescence staining showed positive SSFA2 expression in the acrosome of human sperm. Liquid chromatography–mass spectrometry/mass spectrometry (LC–MS/MS) and co-immunoprecipitation (Co-IP) analyses identified that GSTM3 and Actin interact with SSFA2.
Further investigation revealed that, for the patient, regular intracytoplasmic sperm injection (ICSI) treatment had a poor prognosis. However, artificial oocyte activation (AOA) by a calcium ionophore (A23187) after ICSI successfully rescued the oocyte activation failure for the patient with the SSFA2 variant, and the couple achieved a live birth. This study revealed that SSFA2 plays an important role in acrosome formation, and that the homozygous c.3671G > A loss-of-function variant in SSFA2 caused globozoospermia. SSFA2 may represent a new gene for the genetic diagnosis of globozoospermia, and the successful outcome of AOA-ICSI treatment has potential value for clinicians in their treatment regimen selections.